s01 - Counting DNA Nucleotides
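This is a problem from ROSALIND, where every problem is on a biological topic. Uncle Y (Y叔), who does not know Python, is planning a series called "Learn Bioinformatics with Uncle Y" or "Learn Python with Uncle Y by Solving Problems". I am not sure whether it will be popular; how often it gets updated depends on how well it is received, so let me see your applause!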
Problem
A string is simply an ordered collection of symbols selected from some alphabet and formed into a word; the length of a string is the number of symbols that it contains.
An example of a length 21 DNA string (whose alphabet contains the symbols ‘A’, ‘C’, ‘G’, and ‘T’) is “ATGCTTCAGAAAGGTCTTACG.”
Given: A DNA string s of length at most 1000 nt.
Return: Four integers (separated by spaces) counting the respective number of times that the symbols ‘A’, ‘C’, ‘G’, and ‘T’ occur in s.
Sample Dataset
AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC
Sample Output
20 12 17 21
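Solution
This problem gives a DNA sequence and asks how many times each of A, C, G, and T occurs in it. That is easy: read the file and count.
Python strings have a count method that does the counting for us.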
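# Read the whole DNA string from the downloaded dataset file.
with open("DATA/rosalind_dna.txt", "r") as infile:
    dna = infile.read()

# str.count returns the number of occurrences of the given substring.
print(dna.count("A"), dna.count("C"), dna.count("G"), dna.count("T"))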
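To check this approach against the sample dataset without needing the input file, the counting step can be wrapped in a small function. This is only a minimal sketch: the function name `count_nucleotides` is a hypothetical helper introduced here for illustration and is not part of the original solution.

```python
def count_nucleotides(dna):
    """Return the counts of A, C, G and T in dna as a tuple (hypothetical helper)."""
    # One str.count call per symbol, in the order the problem asks for.
    return tuple(dna.count(base) for base in "ACGT")


sample = "AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC"
print(*count_nucleotides(sample))  # prints "20 12 17 21", matching the Sample Output
```

Because `str.count` only counts the symbol it is given, a trailing newline in the input file does not affect the result.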
Since this problem is so simple, we might as well write a version in C too.
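#include <stdio.h>

int main(void) {
    FILE *INFILE = fopen("DATA/rosalind_dna.txt", "r");
    if (INFILE == NULL) {
        perror("fopen");
        return 1;
    }

    /* fgetc returns an int so that EOF can be told apart from any valid character. */
    int nt;
    int a_cnt = 0, c_cnt = 0, g_cnt = 0, t_cnt = 0;

    /* Single pass over the file; anything other than A, C, G, T (e.g. a newline) is ignored. */
    while ((nt = fgetc(INFILE)) != EOF) {
        switch (nt) {
        case 'A':
            a_cnt++;
            break;
        case 'C':
            c_cnt++;
            break;
        case 'G':
            g_cnt++;
            break;
        case 'T':
            t_cnt++;
            break;
        }
    }

    fclose(INFILE);
    printf("%d %d %d %d\n", a_cnt, c_cnt, g_cnt, t_cnt);
    return 0;
}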
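The C loop visits each character exactly once, whereas four `str.count` calls scan the string four times. For an input of at most 1000 nt the difference hardly matters, but a comparable single-pass version in Python could be sketched with `collections.Counter` from the standard library. The function name `count_nucleotides_single_pass` is an assumption made for this example.

```python
from collections import Counter


def count_nucleotides_single_pass(dna):
    """Count A, C, G and T in one pass over dna (hypothetical helper)."""
    # Counter tallies every character it sees, including newlines.
    counts = Counter(dna)
    # Only the four nucleotides are reported; a missing key counts as 0.
    return tuple(counts[base] for base in "ACGT")


print(*count_nucleotides_single_pass("AACGT\n"))  # prints "2 1 1 1"
```

As in the C `switch`, characters outside A, C, G, and T are simply left out of the output.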